CS234 Final Project - #edtwt: Eating Disorder Communities on Twitter

December 13, 2021
Name: Olivia Giandrea

Table of Contents

  1. Introduction
  2. Preparing the Twitter API
  3. Gathering Tweets
  4. Formatting the Response
  5. Sentiment Analysis
  6. Limitations
  7. Conclusion

Part 1: Introduction

"edtwt," or "Eating Disorder Twitter," is a group of users who discuss their experiences with a variety of eating disorders, including (but not limited to) anorexia nervosa, bulimia, binge eating disorder, and eating disorders not otherwise specified, or 'ednos'.

Eating disorders disproportionately affect young women, and cases have risen sharply over the past few years. Online communities that encourage eating disorders (ED) are extremely dangerous. Social media has made it easier for these pro-ED messages to spread, reaching vulnerable populations that range from healthy individuals who may be influenced to engage in ED behaviors to individuals who already have an ED. Recent research suggests a link between viewing online ED content and engaging in offline ED behavior, and the age group most affected by EDs is the young users of platforms like Twitter.

Some platforms have attempted to censor dangerous content like this, but doing so can result in 'over-warning': flagging 'triggering' hashtags or topics for all users, even those who are seeking help or support. The warnings and interventions can therefore quickly lose their efficacy and even cause more harm than good. To avoid issues like this, it's important that we truly understand the nature of online communities before introducing interventions. In particular, we need to know more about the types of ED-related information people using social media are being exposed to.

In this study, I collected tweets from Twitter over a roughly week-long period to investigate the 'daily' conversations of edtwt. To filter the tweets I searched for with Twitter's API, I used the hashtags from the 2017 study by Dawn B. Branley and Judith Covey, which they scraped from the website www.hashtagify.me.

Part 2: Preparing the Twitter API

Twitter's official API allows developers and researchers to collect tweets, as well as information about them and their authors, to study the platform's handling of data and communities. I signed up for a developer account, and used this API to collect my tweets.

As mentioned earlier, I filtered Twitter's 'search' endpoint using the hashtags from Branley and Covey's 2017 paper.

In order to use Twitter's API, I had to sign in with my unique authorization tokens. We can begin by defining these, as well as importing a few modules we'll need to get these tweets.

Using these authorizations, we can build our Twitter API URL.
We pass our bearer_token in the request header to authorize ourselves, and then define which data points we want returned in our response.
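As a rough illustration of this step, here is a minimal sketch of how the header and query parameters could be assembled. It assumes the v2 recent-search endpoint; the function names `create_headers` and `create_params`, the chosen `tweet.fields`, and the example query are illustrative assumptions, not necessarily what the notebook used.

```python
# Assumption: the v2 "recent search" endpoint, which covers roughly the last week.
SEARCH_URL = "https://api.twitter.com/2/tweets/search/recent"


def create_headers(bearer_token):
    """Build the authorization header expected by the Twitter API v2."""
    return {"Authorization": f"Bearer {bearer_token}"}


def create_params(query, start_time, end_time, max_results=100, next_token=None):
    """Build the query parameters for one page of search results."""
    params = {
        "query": query,              # e.g. hashtags joined with OR
        "start_time": start_time,    # ISO 8601 timestamp
        "end_time": end_time,
        "max_results": max_results,  # recent search allows 10 to 100 per page
        # Hypothetical selection of data points to return for each tweet.
        "tweet.fields": "id,text,author_id,created_at,public_metrics",
    }
    if next_token is not None:
        params["next_token"] = next_token  # only sent when paginating
    return params


headers = create_headers("MY_BEARER_TOKEN")  # placeholder, not a real token
params = create_params(
    "#edtwt OR #proana",  # illustrative query only
    "2021-12-06T00:00:00Z",
    "2021-12-12T00:00:00Z",
)
```

Keeping the parameters in a dictionary rather than formatting them into the URL by hand lets the HTTP library handle the escaping of characters like `#`.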

Once we have authorized ourselves, we can actually request the tweet data from the Twitter API, using the requests library and the URL we just built. If we are paginating our results, we will use the next_token to access the next page of tweets.

This is the complicated part. Using the JSON data returned by connect_to_endpoint, we will loop through each tweet, grab the desired data points, and append each tweet's data to a CSV file.
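A minimal sketch of that loop, using only the standard csv module, might look like the following. The column list `FIELDS` and the function name `append_to_csv` are assumptions for illustration; the notebook may have stored more data points per tweet.

```python
import csv
import io

# Hypothetical set of columns; the real notebook may have kept more fields.
FIELDS = ["id", "created_at", "author_id", "text"]


def append_to_csv(json_response, csv_file):
    """Write one CSV row per tweet in a v2 search response; return the row count."""
    writer = csv.writer(csv_file)
    count = 0
    # The v2 API puts tweets under "data"; the key is absent when a page is empty.
    for tweet in json_response.get("data", []):
        writer.writerow([tweet.get(field, "") for field in FIELDS])
        count += 1
    return count


# Usage with an in-memory file and a made-up response.
buffer = io.StringIO()
sample = {"data": [{"id": "1", "created_at": "2021-12-06T10:00:00.000Z",
                    "author_id": "42", "text": "example, with a comma"}]}
written = append_to_csv(sample, buffer)
```

With a real file, opening it in append mode (`open("tweetData.csv", "a", newline="", encoding="utf-8")`) lets each page of results be added without rewriting earlier rows.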

Part 3: Gathering Tweets

Using all the helper functions defined previously, we'll actually gather our tweets now:

  1. Set up our inputs
  2. Connect to the URL
  3. While we have tweets to scrape, process them
  4. Write our data to the CSV file tweetData.csv
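The pagination loop in step 3 can be sketched as follows. To keep the example runnable offline, the network call is passed in as a function: `fetch_page` is a hypothetical stand-in for a wrapper around connect_to_endpoint, and `max_pages` is an added safety cap rather than something from the original notebook.

```python
def collect_all_pages(fetch_page, max_pages=100):
    """Follow next_token from page to page until the API stops returning one.

    fetch_page(next_token) must return one parsed v2 search response (a dict).
    """
    tweets = []
    next_token = None  # the first request carries no token
    for _ in range(max_pages):
        page = fetch_page(next_token)
        tweets.extend(page.get("data", []))
        # The v2 API reports the token for the next page under "meta".
        next_token = page.get("meta", {}).get("next_token")
        if next_token is None:
            break  # no more pages
    return tweets


# Usage with canned pages standing in for real API responses.
fake_pages = {
    None: {"data": [{"id": "1"}], "meta": {"next_token": "abc"}},
    "abc": {"data": [{"id": "2"}], "meta": {}},
}
all_tweets = collect_all_pages(lambda token: fake_pages[token])
```

In a real run, a short pause between requests is usually needed to stay within the API's rate limits.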

I originally printed out endpoints to see whether each request connected or not and the total number of tweets, but I have commented out those print statements to save space in the notebook.

Part 4: Formatting the Response

We'll read the results from the CSV file, so we don't need to scrape again in the future, and store them in a dataframe for easy analysis.

We'll also clean it a little, removing the concatenation in the tweet texts and re-sorting by post date, descending.

So, we were able to scrape 12,020 tweets to work with from 2021-12-06 to 2021-12-12.

Part 5: Sentiment Analysis

What can we learn from this data?

Table of Contents

  1. Piechart
  2. Polarity
  3. WordClouds
  4. Vectors

1: Piechart

First, we'll import the modules we'll need.

percentage is a helper function to calculate percentages.
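A minimal version of such a helper might look like this; the exact signature is an assumption.

```python
def percentage(part, whole):
    """Return part as a percentage of whole, e.g. percentage(1, 4) is 25.0."""
    return 100 * float(part) / float(whole)
```

For example, `percentage(len(positive_list), len(tweets))` would give the share of positive tweets, where both names are hypothetical.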

We'll first convert our df to a dictionary to perform the analysis, then check how many tweets we have and look at the content of one of them.

We can analyse the content of each tweet, as demonstrated above, to determine if it's positive, negative, or neutral surrounding the topic of EDs using sentiment analysis. For each tweet, we'll score it, then add it to the appropriate list depending on the tweet's sentiment score.
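The score-then-sort step can be sketched like this. The scoring function is passed in as a parameter so the example runs without any sentiment library; `toy_score` and its tiny lexicon are purely illustrative stand-ins, not the classifier used in this notebook, and the zero threshold is an assumption.

```python
def bucket_by_sentiment(texts, score_fn, threshold=0.0):
    """Score each text and append it to the positive, negative, or neutral list."""
    buckets = {"positive": [], "negative": [], "neutral": []}
    for text in texts:
        score = score_fn(text)
        if score > threshold:
            buckets["positive"].append(text)
        elif score < -threshold:
            buckets["negative"].append(text)
        else:
            buckets["neutral"].append(text)
    return buckets


# Toy stand-in scorer (NOT a real sentiment model): sums a few word weights.
TOY_LEXICON = {"love": 1.0, "good": 0.5, "hate": -1.0, "bad": -0.5}


def toy_score(text):
    return sum(TOY_LEXICON.get(word, 0.0) for word in text.lower().split())


buckets = bucket_by_sentiment(
    ["i love this", "i hate this", "just a caption"], toy_score
)
```

Swapping `toy_score` for a real scorer changes the results but not the bucketing logic, which is why the sampled tweets are worth reading by hand afterwards.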

Our total tweets of each category:

And a piechart, as a visual:

Let's print a few examples of each, to see what the classifier was focusing on:

Hm. Positive tweets still don't seem that positive - "fatspo" is a phrase usually accompanied by an image of a "fat" person, used as motivation to continue fasting or stay thin. The second tweet, however, mentions that the user has recovered and is now at a healthy weight. So, there's a strange mix.

Let's look at the negative examples:

These tweets are certainly more negative than the positive tweets: swearing, cries for help, etc.

And the neutral:

These are difficult to code: the first tweet, for example, shows the user is upset because of their AN diagnosis, but this is only known from the emoji. If the emoji wasn't there, this tweet could easily be interpreted as a simple statement or even a positive thing, in the user's eyes, if they are desperately trying to lose weight.

Now that we've explored our tweet examples a bit further, let's approach the sentiment analysis from a new angle using the polarity and subjectivity categories.

2: Polarity

We'll start by cleaning our tweets and building a new dataframe without emojis, retweet markers, usernames, or unnecessary \n and \t characters, so our analysis can be more accurate. First, we'll define a list of words we will want to replace: the cleaning process splits contractions with a space in the middle, so we'll need to fix that.

Now, we'll strip RT, usernames, punctuation, and emojis from the tweet text. Doing so breaks our contractions as well, so we'll fix those next.
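A minimal sketch of this cleaning pass, using only regular expressions, is shown below. The function name `clean_tweet` and the short `CONTRACTION_FIXES` table are assumptions for illustration; the notebook's replacement list was presumably longer.

```python
import re

# Hypothetical repair table: stripping apostrophes leaves "don t", "i m", etc.
CONTRACTION_FIXES = {
    "don t": "don't",
    "can t": "can't",
    "i m": "i'm",
    "it s": "it's",
}


def clean_tweet(text):
    """Strip retweet markers, usernames, punctuation, and emojis from a tweet."""
    text = re.sub(r"^RT\s+", "", text)     # leading retweet marker
    text = re.sub(r"@\w+:?", "", text)     # usernames such as "@user:"
    text = re.sub(r"[\n\t]", " ", text)    # newline and tab characters
    text = re.sub(r"[^\w\s]", " ", text)   # punctuation and emojis become spaces
    text = re.sub(r"\s+", " ", text).strip().lower()
    # The punctuation step split contractions in two; stitch them back together.
    for broken, fixed in CONTRACTION_FIXES.items():
        text = re.sub(rf"\b{re.escape(broken)}\b", fixed, text)
    return text


cleaned = clean_tweet("RT @user: I don't know 😭\n#edtwt")
# cleaned is "i don't know edtwt"
```

The word boundaries (`\b`) matter here: a plain string replace of "i m" would also corrupt phrases like "hi my".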

Excellent! Now our analysis may be a little more accurate. Next, we will calculate the negative, positive, neutral, and compound values for our analysis, adding them to our df.

The sentiment property of TextBlob returns two values: polarity and subjectivity.

Polarity is a float in the range [-1, 1], where 1 == a positive statement and -1 == a negative statement.
Subjectivity is also a float, in the range [0, 1], where 0 == an objective statement (factual) and 1 == a personal statement (opinion).

With this data, we can see how strongly each tweet leans positive or negative, and we can also see how factual or opinionated the tweets are: are users mostly talking about personal experiences and their emotions, or about news sources and studies?

Let's create one more piechart, for a visual.

Interesting. The percentages changed a little bit:

  1. positive: 33.6% -> 33.37%
  2. neutral: 21.2% -> 20.03%
  3. negative: 45.3% -> 46.6%

Taking away the extra characters revealed that some tweets were more negative than the analysis algorithm originally classified them.

Let's investigate the polarity and subjectivity values by taking the average of each classification for each sentiment.
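As a sketch of this per-sentiment averaging, here is a standard-library version. The notebook most likely did this on the dataframe; the row layout and the key names `sentiment`, `polarity`, and `subjectivity` are assumptions.

```python
from collections import defaultdict
from statistics import mean


def mean_by_label(rows):
    """Average polarity and subjectivity within each sentiment label."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["sentiment"]].append(row)
    return {
        label: {
            "polarity": mean(r["polarity"] for r in group),
            "subjectivity": mean(r["subjectivity"] for r in group),
        }
        for label, group in grouped.items()
    }


# Made-up rows, for illustration only.
rows = [
    {"sentiment": "positive", "polarity": 0.5, "subjectivity": 0.75},
    {"sentiment": "positive", "polarity": 0.25, "subjectivity": 0.25},
    {"sentiment": "neutral", "polarity": 0.0, "subjectivity": 0.0},
]
averages = mean_by_label(rows)
```

If the rows already live in a pandas dataframe, grouping by the sentiment column and taking the mean would do the same job.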

The average polarity of every category is actually surprisingly close to 0, the value of a "neutral" tweet. However, probably as expected, the "neutral" tweets are much less subjective (opinionated) than the negative/positive tweets.

I would assume the close-to-zero averages can be explained by the fact that a lot of these tweets are simply very different from one another. #edtwt is a large community with a lot of representation. In fact, not everyone on edtwt even necessarily wants to lose weight, although that is the stereotype. According to this analysis, the positive tweets are more positive than the negative tweets are negative. I think this could have arisen from some misclassifications, as we saw earlier when we sampled a couple of tweets from each sentiment category. It could also be that, because edtwt is such a negative place (as suggested by negative tweets being the largest category), positive tweeters go out of their way to spread love on the platform.

3: WordClouds

Let's next investigate which words are used the most in each sentiment with WordClouds.

This is pretty much expected. The majority of tweets mention eating disorders, anorexia (the most common ED in the US), and "thinspo," or "thin inspiration." Thinspo is typically the opposite of the earlier-described "fatspo" posts: a photo of an exceptionally thin user is posted along with a caption, usually either simply "thinspo" or something along the lines of "I wish I looked like this." From my time scrolling through edtwt timelines, the posts are typically personal stories ("starting a fast today!", "i binged today, i feel like a failure", etc.) or thinspo.

Let's take a look at the positive tweets' composition.

Again, "eating disorder" and "anorexic" are used often, but the words are certainly different overall. "Love," "help," "eat," "friend," and "good," for example, suggest a pro-recovery environment in my mind.

What about the negative tweets?

A little different from the positive tweets. "Body," "weight," "fat," "hate," and multiple swears show the side of edtwt that is actively struggling.

It's interesting that the words differ like this, but they certainly don't differ a lot. The phrase "eating disorder" is still prevalent in each of the wordclouds regardless of the sentiment, and there's always a large emphasis on specifically anorexia. All in all, whether the tweet is positive or not, as expected, edtwt is focused entirely on weight, appearance, and eating.

Let's also quickly compare the tweets' length and word count for each sentiment, to see if they vary at all.

Yes, the negative and positive tweets differ from the neutral tweets. The neutral tweets, as I expected, are mostly just captions to images, usually thinspo. The negative tweets are actually not very negative, but use language that can be coded as such. Take the joking phrase "nobody's going to go punch a wall," marked as a joke by the laughing emojis that follow it: the word "punch" is probably going to be coded as negative, although the main message of the tweet is actually telling another user that they did not need to add a trigger warning. The positive tweets, however, are more negative in my opinion: users describe being "forced" to eat, and one positive tweet is actually misclassified, as it references a singer named "Ana" rather than the shorthand for "anorexic."

All in all, the neutral tweets are captions, so they are shorter. The positive and negative tweets are replies to threads or personal stories, so not only are they more subjective, but they are also longer. Interesting!

4: Vectors

For the last part of the analysis, I wanted to continue on the line of finding which words were used the most often, as I did with the WordClouds. For this part, I will use tokenization.

In order to count the words, we can use the CountVectorizer similar to the classification from Assignment 7.

Let's first find the most used single words.

Users discuss their disorders and personal experiences the most, as we've seen earlier. What about longer phrases? Let's find the most common 2- and 3-word phrases using n-grams.
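To make the idea of n-gram counting concrete, here is a standard-library sketch. It is a simplified stand-in for CountVectorizer, not the notebook's actual code: the tokenizer, the tiny `STOP_WORDS` set, and the function name `top_ngrams` are all assumptions, so counts from the real vectorizer may differ.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "the", "i", "to", "and", "of", "my", "is"}


def top_ngrams(texts, n=1, k=10):
    """Return the k most common n-word phrases across all texts."""
    counts = Counter()
    for text in texts:
        tokens = [t for t in re.findall(r"[a-z']+", text.lower())
                  if t not in STOP_WORDS]
        # Slide a window of n tokens across the tweet.
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts.most_common(k)


# Made-up example texts.
texts = ["eating disorder recovery", "my eating disorder", "pro ana"]
bigrams = top_ngrams(texts, n=2, k=1)
# bigrams is [("eating disorder", 2)]
```

Setting `n=1` gives the single-word counts, while `n=2` and `n=3` give the 2- and 3-word phrases. Note that n-grams are counted within each tweet, so a phrase never spans two tweets.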

Just like the wordclouds, "eating disorder" is the most common. However, "pro ana" is also extremely high on the list, showing that the community is not very pro-recovery. Moreover, there are plenty of mentions of "feeling like" or "looking like" a certain "thinspo thread", so it's an environment that highly encourages competition and comparison. This is extremely dangerous for young, impressionable teenagers, who are also the group most at risk of developing eating disorders.

As for the 3-word phrases:

As I mentioned earlier, anorexia and bulimia are not the only eating disorders on edtwt. Binge eating disorder is a disorder in which a person consumes large amounts of food (usually far past the point of feeling full) in a short period of time; these episodes are known as "binges." This disorder usually results from an "all or nothing" mentality and goes hand-in-hand with highly restrictive practices: the person restricts so heavily that they cave to the pressure and the physical/mental strain, and therefore consume as much "unallowed" food as possible before they must "reset" the next day.

There are also many mentions of abuse. Let's look into those:

Investigating these tweets, they are (in order) talking about a character from a TV series, a personal recovery story, a fanfiction, advocating for better healthcare, and a news story about a Colorado cult leader who died earlier this year. The tweets mentioning abuse are varied in content, then, and seem to be mostly reporting on other people (hypothetical, like in the healthcare tweet, or fictional, like in the TV series and fanfiction tweets). It doesn't seem like edtwt is mostly being used to talk about one's own abuse, but rather the abuse of others.

However, "eating disorder recovery" is also high on the list. This is promising, as it suggests a fair amount of users are either promoting or at least starting a dialogue about recovery. However, the community in general has still proven to be extremely toxic and dangerous, especially for young, vulnerable minds.

Part 6: Limitations

This data was taken over a 6-day period from Twitter. I chose to research Twitter not because I think it has a particularly awful community, but simply because of time constraints and because it had the easiest API to sign up for. Since a lot of Twitter seems to be focused on sharing thinspo images or personal stories, I think Instagram and Tumblr would be good platforms to research as well, given that they were both founded on the sharing of images. Instagram deprecated their API in 2020 and Tumblr's API was very difficult to sign up for, but I think those two would absolutely offer new, interesting insights into online ED communities.

Moreover, I just scraped data once to study the nature of the ED community. Eni suggested keeping a Twitter Stream API open, and monitoring which posts were taken down, so I could therefore monitor Twitter's response to the nature of the ED community. Again, due to time constraints, I was unable to figure out how to do that in time, but I think that is the logical next step in this research topic.

My final thought surrounding the design of my research approach is that 1) some irrelevant data was included, like the tweet about the singer named Ana, and 2) some tweets I think were misclassified as positive when they should be negative and vice versa. As for the first issue, I expected this, as edtwt users will purposefully use difficult-to-scrape hashtags and keywords to avoid detection from people like me. Nonetheless, I'm sure the query I used in the API can be further refined to combat this. As for the second issue, I think that's more an issue with the sentiment analyzing modules I chose to use, and therefore not much can be done other than using another module or classifying by hand, etc.

Part 7: Conclusion

In conclusion, I think #edtwt is a very dangerous, toxic, terrible online community. Negative tweets are the largest category, often describing a user's self-hatred for how "fat" they are. Even the neutral tweets are shrouded in negativity, as they are most often captions of dangerously thin women, urging young users to fast more, eat less, and be as thin as them someday. There are discussions of committing suicide, self-harm, starvation, suggestions to go on diets of 350 calories a day, and other terrible threads. I am exceptionally worried about the fact that so many Twitter users are young, impressionable teenagers who could be easily manipulated and influenced by harmful messages and communities like #edtwt.

Other studies like mine have been done before, and I think it is more than safe to conclude that hashtags like #proana and tweets containing keywords like "eating disorder" should be monitored. As I stated earlier, I think the next step would be to see how Twitter is monitoring these tweet feeds, and find suggestions on how to improve content deletion.